Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding diagnostics channels to Fetch #2701

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

crysmags
Copy link

@crysmags crysmags commented Feb 5, 2024

This relates to...

This is a follow up to a previous PR [Added diagnostics channels on fetch] (#2210) , this includes [proposed changes] (#2210 (comment)) to emit the start event synchronously at the beginning of the function, and emit the end event right before returning, on all paths.

Rationale

We added five diagnostics channels on fetch to trace the same information as would tracingChannel.tracePromise. Some channels won't make perfect sense but we want to stay consistent with TracingChannel as we want to use it to trace fetch and maybe other functions when TracingChannel become available in the most commonly used versions of node.js. The descriptions of each channel and what gets published to them are detailed in DiagnosticsChannel.md

Use case: enable APM tools to trace fetch

Changes

Added diagnostics channel to fetch
Added tests to diagnostics channel for fetch

Status

Copy link
Member

@KhafraDev KhafraDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please undo the indent changes please

@crysmags
Copy link
Author

crysmags commented Feb 5, 2024

Could you please undo the indent changes please

With the addition of the try block the indentations were necessary, excluding them the commit will result in lint errors.

Copy link
Member

@KhafraDev KhafraDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that.. why is the extra try block needed? Wouldn't it be possible to add a .finally() to the p.promise instead?

lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
benchmarks/benchmark.js Outdated Show resolved Hide resolved
@crysmags
Copy link
Author

crysmags commented Feb 6, 2024

I missed that.. why is the extra try block needed? Wouldn't it be possible to add a .finally() to the p.promise instead?

Yes, I can add the finally to the promise, though the initial start event is placed in the first try block

Copy link
Member

@metcoder95 metcoder95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

lib/core/diagnostics.js Outdated Show resolved Hide resolved
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job. Some nits.

lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
lib/fetch/index.js Outdated Show resolved Hide resolved
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mcollina
Copy link
Member

Can you resolve the conflicts with main?

@crysmags
Copy link
Author

@mcollina conflicts have been resolved 👍🏼

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mcollina
Copy link
Member

Since your work started, we have migrated the tests to node:test. Could you converT?

@Uzlopak
Copy link
Contributor

Uzlopak commented Feb 14, 2024

My proposed suggestion would migrate the test from tap to node:test ;)

@crysmags
Copy link
Author

@Uzlopak committed the changes, thanks for that!

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

cc @metcoder95

@Uzlopak
Copy link
Contributor

Uzlopak commented Feb 14, 2024

Interesting. The unit fests found a valid bug.

@Uzlopak

This comment was marked as outdated.

lib/core/diagnostics.js Outdated Show resolved Hide resolved
lib/core/diagnostics.js Outdated Show resolved Hide resolved
lib/core/diagnostics.js Outdated Show resolved Hide resolved
lib/core/diagnostics.js Outdated Show resolved Hide resolved
lib/core/diagnostics.js Outdated Show resolved Hide resolved
lib/core/diagnostics.js Outdated Show resolved Hide resolved
@Uzlopak

This comment was marked as outdated.

Copy link
Member

@metcoder95 metcoder95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor detail, and LGTM 👍

lib/core/diagnostics.js Outdated Show resolved Hide resolved
@mcollina
Copy link
Member

Thanks for the great work and keeping up with us!

Copy link
Member

@KhafraDev KhafraDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't block but I am a huge -1 on this. "Node-ifying" fetch will turn it into one of the other half-dozen implementations node has that aren't really maintainable or maintained.

@mcollina
Copy link
Member

@tsctx ptal


I won't block but I am a huge -1 on this. "Node-ifying" fetch will turn it into one of the other half-dozen implementations node has that aren't really maintainable or maintained.

This PR changes around 80 lines of code in the actual algorithm. It doesn't seem this is making fetch() unmaintainable.

@mcollina
Copy link
Member

@jasnell can you take a quick look too?

Copy link
Member

@RafaelGSS RafaelGSS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me

@@ -114,6 +122,75 @@ if (undiciDebugLog.enabled || fetchDebuglog.enabled) {
isClientSet = true
}

// Track fetch requests
if (fetchDebuglog.enabled && diagnosticsChannel.tracingChannel) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this code be placed by users instead of having a helper inside undici core?


// subscribersCheck will be called at the beginning of the fetch call
// and will check if we have subscribers
function subscribersCheck () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK tracingChannel.hasSubscribers will do all these checks for you. No need to do it manually.

https://github.com/nodejs/node/blob/main/lib/diagnostics_channel.js#L280

cc: @Qard

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tracingChannel.hasSubscribers getter is fairly new so this needs to continue doing things the old way for awhile.

lib/web/fetch/index.js Show resolved Hide resolved
@KhafraDev
Copy link
Member

It absolutely will make maintaining fetch harder.

@mcollina
Copy link
Member

How? For the most part this is just an indentation level change.

@KhafraDev
Copy link
Member

It's not the code itself, it's how intrusive it is. We are taking a web spec and shoving node-specific things into it, things that are not entirely compatible with the spec (I assume that's why you said "for the most part"). There are steps that are rewritten and things added to fetch itself. Every, every single time we have even slightly deviated from the spec it always comes back to bite us in the ass later on, even the smallest change to the spec makes maintaining it harder. I can find specific examples but I hope you'll trust me on that...

@tsctx
Copy link
Member

tsctx commented Jun 12, 2024

@crysmags I think it could be implemented with fewer changes. In the way... tsctx@56c36f7

@tsctx
Copy link
Member

tsctx commented Jun 12, 2024

Please update the document.

@tsctx Which document are you referring to?

Documentation of how to use the added debuglog's.

@Uzlopak
Copy link
Contributor

Uzlopak commented Jun 12, 2024

Yeah, I also think that minimal invasiveness like the solution of @tsctx is more preferable.

That was also my thought with #2857

But at one point you added code with runStores() and I did not investigate why that is now the preferred way to do it.

@Qard
Copy link
Member

Qard commented Jun 12, 2024

The runStores method is required for AsyncLocalStorage to be able to alter its state to a value produced from Tracing Channel. APMs use this to transform Tracing Channel event data into our Span instances. The version proposed by @tsctx would not be usable by APM vendors due to the lack of runStores, and also has incorrect end event timing as it needs to be the sync end of the function call, not the start of the next tick.

@mcollina
Copy link
Member

Should we do a call to discuss this?

@Qard
Copy link
Member

Qard commented Jun 13, 2024

I'm happy to get on a call with folks about it, if that is needed.

@mcollina
Copy link
Member

Here is a doodle: https://doodle.com/meeting/participate/id/dwRDjnJb.

@Uzlopak
Copy link
Contributor

Uzlopak commented Jun 14, 2024

Who is also invited? Am i also invited?

@mcollina
Copy link
Member

Yes

@Qard
Copy link
Member

Qard commented Jun 14, 2024

Is it intentional that the selected dates in the Doodle don't start until July? Seems like quite awhile to wait to make a decision on this. 🤔

@mcollina
Copy link
Member

I'm traveling the next 2 weeks.

@Qard
Copy link
Member

Qard commented Jun 14, 2024

Fair enough. We can find time after your travels then. Safe travels! 👋🏻

@Qard
Copy link
Member

Qard commented Jul 18, 2024

What's happening with this? We still planning a meeting of some sort? @mcollina

@cola119
Copy link
Member

cola119 commented Aug 3, 2024

Any updates? I'm looking forward to this feature for network inspection fetch support.

@Qard
Copy link
Member

Qard commented Nov 7, 2024

@KhafraDev can we do a meeting next week to figure out what can be done to unblock this? There are several companies asking me to make this happen.

@KhafraDev
Copy link
Member

I don't have time - the semester is ramping up for me and finals are happening in a little over a month. There are still issues I have with this PR:

The further from the spec we deviate, and the more node-specific features we implement, the harder it becomes to maintain. It also becomes harder for other runtimes to maintain compatibility. Fortunately, undici (outside of the web specs) provides a fantastic api that fits these requirements, or at least is more open to changes that would satisfy them.

@rochdev
Copy link

rochdev commented Nov 11, 2024

This has gone unresolved https://github.com/nodejs/undici/pull/2701/files#r1636015825

This is an easy change that we can make.

#2701 (comment)

Nothing really changes with the API or the functionality, the PR simply adds observability to it, which is a cross-cutting concern. I would also argue that observability should be baked into at the very least any library doing IO, which undici does. What would be a way forward if the diagnostics capability provided by the runtime cannot be considered?

@KhafraDev
Copy link
Member

I would also argue that observability should be baked into at the very least any library doing IO, which undici does

https://github.com/nodejs/undici/blob/main/docs/docs/api/DiagnosticsChannel.md

Nothing really changes with the API or the functionality

Emitting a warning when adding more than 10 listeners to an EventTarget doesn't change the API or the functionality, but it still causes issues like nodejs/node#54758. There is more to consider when making changes to specs, especially living standards.

What would be a way forward if the diagnostics capability provided by the runtime cannot be considered?

fetch is a browser spec, the better option is to use the other APIs undici provides.

@Qard
Copy link
Member

Qard commented Nov 12, 2024

I've added a comment which covers that. Seems like an easy enough change to make, though I also don't agree on the significance--Node.js core is wildly inconsistent about promise timing, being a tick off is going to make zero difference and microtask timing is really not something that should be depended on in the first place. V8 itself has changed its microtask timing a few times as the spec has evolved, so depending on that timing is just not a good idea.

Do you have a concrete and actionable objection to provide for that? What about the current code is too intrusive? What can we do to make it less so? Yes, this is doing node-specific things in a web spec implementation, but this is a web spec implementation for use in Node.js so I don't see how that's a concern. What about the changes are logically inconsistent with the spec?

Yes, adding this increases code complexity a bit, but this capability is needed by all observability providers which are in use by almost all Node.js applications. The majority of users of undici need this capability.

fetch is a browser spec, the better option is to use the other APIs undici provides.

That provides a view of the internals, not of the interface actually being used by the end user. It's not a useful level to be capturing for observability that is useful and actionable to the user.


This PR has been stuck in limbo for a long time. Can we have the @nodejs/undici maintainers please make a vote to decide how we proceed with this?

santigimeno added a commit to nodesource/nsolid that referenced this pull request Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.